explicit memory
Explicit vs. Implicit Memory: Exploring Multi-hop Complex Reasoning Over Personalized Information
Zhang, Zeyu, Zhang, Yang, Tan, Haoran, Li, Rui, Chen, Xu
In large language model-based agents, memory serves as a critical capability for achieving personalization by storing and utilizing users' information. Although some previous studies have adopted memory to implement user personalization, they typically focus on preference alignment and simple question-answering. However, in the real world, complex tasks often require multi-hop reasoning over a large amount of user information, which poses significant challenges for current memory approaches. To address this limitation, we propose the multi-hop personalized reasoning task to explore how different memory mechanisms perform in multi-hop reasoning over personalized information. We explicitly define this task and construct a dataset along with a unified evaluation framework. We then implement various explicit and implicit memory methods and conduct comprehensive experiments. We evaluate their performance on this task from multiple perspectives and analyze their strengths and weaknesses. In addition, we explore hybrid approaches that combine both paradigms and propose the HybridMem method to address their limitations. We demonstrate the effectiveness of our proposed model through extensive experiments. To benefit the research community, we release this project at https://github.com/nuster1128/MPR.
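The abstract's notion of multi-hop reasoning over personalized information can be made concrete with a toy sketch. This is not the paper's implementation: the explicit memory is modeled here as (subject, relation, object) triples, and a multi-hop query simply chains lookups through intermediate entities. All names and facts below are illustrative.

```python
# Illustrative sketch (not the paper's method): multi-hop lookup over an
# explicit memory of user facts stored as (subject, relation, object) triples.

def lookup(memory, subject, relation):
    """Return all objects linked to `subject` via `relation`."""
    return [o for s, r, o in memory if s == subject and r == relation]

def multi_hop(memory, start, relations):
    """Follow a chain of relations, hopping through intermediate entities."""
    frontier = [start]
    for rel in relations:
        frontier = [o for e in frontier for o in lookup(memory, e, rel)]
    return frontier

memory = [
    ("user", "works_at", "AcmeCorp"),
    ("AcmeCorp", "located_in", "Berlin"),
    ("user", "owns", "bicycle"),
]

# "In which city is the user's employer located?" needs two hops.
print(multi_hop(memory, "user", ["works_at", "located_in"]))  # ['Berlin']
```

The point the task highlights is that no single memory entry answers the question; the answer only emerges by composing two retrievals.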
TypedThinker: Typed Thinking Improves Large Language Model Reasoning
Wang, Danqing, Ma, Jianxin, Fang, Fei, Li, Lei
Despite significant advancements in the reasoning capabilities of Large Language Models (LLMs), the lack of diverse reasoning solutions often makes them trapped in a limited solution search area. In this paper, we propose TypedThinker, a novel framework that enhances LLMs' problem-solving abilities by incorporating multiple reasoning types (deductive, inductive, abductive, and analogical). Our analysis across four benchmarks reveals that different reasoning types uniquely solve distinct sets of problems, highlighting the importance of diverse thinking approaches. TypedThinker addresses two key challenges: selecting appropriate reasoning types for given problems and effectively implementing specific reasoning types. Through self-training on successful experiences, TypedThinker learns an implicit policy for reasoning type selection and application. Experimental results demonstrate significant improvements over baseline models, with accuracy increases of 3.4% for Mistral 7B and 16.7% for LLaMA3 8B across four reasoning benchmarks. Notably, TypedThinker shows effective generalization to new benchmarks and can further enhance the reasoning capability of powerful models like GPT-4o. The code is released at https://github.com/dqwang122/ThinkHub.
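The selection step that TypedThinker describes, choosing a reasoning type per problem, can be sketched in miniature. The paper learns an implicit policy via self-training on successful experiences; here a toy table of empirical success rates stands in for that policy, and all problem kinds and numbers are invented for illustration.

```python
# Hedged sketch: pick a reasoning type from recorded per-kind success rates.
# A stand-in for the learned selection policy, not the paper's implementation.

REASONING_TYPES = ["deductive", "inductive", "abductive", "analogical"]

def select_type(success_rates, problem_kind):
    """Pick the reasoning type with the best recorded success on this kind."""
    rates = success_rates.get(problem_kind, {})
    return max(REASONING_TYPES, key=lambda t: rates.get(t, 0.0))

# Toy record of past successes per problem kind (illustrative values).
success_rates = {
    "math_proof": {"deductive": 0.8, "inductive": 0.3},
    "pattern_completion": {"inductive": 0.7, "analogical": 0.6},
}

print(select_type(success_rates, "math_proof"))  # deductive
```

The design choice the paper motivates is visible even in this toy: different problem kinds route to different reasoning types rather than a single fixed strategy.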
$\text{Memory}^3$: Language Modeling with Explicit Memory
Yang, Hongkang, Lin, Zehao, Wang, Wenjin, Wu, Hao, Li, Zhiyu, Tang, Bo, Wei, Wenqiang, Wang, Jinbo, Tang, Zeyun, Song, Shichao, Xi, Chenyang, Yu, Yu, Chen, Kai, Xiong, Feiyu, Tang, Linpeng, E, Weinan
The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowledge externalized to explicit memories, the LLM can enjoy a smaller parameter size, training cost, and inference cost, all proportional to the amount of remaining "abstract knowledge". As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs as well as RAG models, and maintains higher decoding speed than RAG. The model is named $\text{Memory}^3$, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values). We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable and a two-stage pretraining scheme that facilitates memory formation.
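The retrieval step behind an explicit-memory scheme of this kind can be sketched as follows. Knowledge chunks are pre-encoded once into key vectors, and at inference the query pulls only the top-k memories instead of re-reading raw text. The vectors and dot-product scoring below are toy stand-ins, not the Memory^3 encoder or its sparsification mechanism.

```python
# Minimal sketch of explicit-memory retrieval: score pre-encoded memory keys
# against a query vector and keep the top-k. Illustrative only.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, memory_keys, k=2):
    """Return indices of the k memory entries most similar to the query."""
    ranked = sorted(range(len(memory_keys)),
                    key=lambda i: dot(query, memory_keys[i]),
                    reverse=True)
    return ranked[:k]

# Toy pre-encoded memory keys (2-dimensional for readability).
memory_keys = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(retrieve([1.0, 0.1], memory_keys, k=2))  # [0, 2]
```

The cost argument in the abstract rests on this shape: the encoding work is paid once at write time, so reads stay cheap relative to re-encoding retrieved text as RAG does.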
Associative Transformer
Sun, Yuwei, Ochiai, Hideya, Wu, Zhirong, Lin, Stephen, Kanai, Ryota
Emerging from the pairwise attention in conventional Transformers, there is a growing interest in sparse attention mechanisms that align more closely with localized, contextual learning in the biological brain. Existing studies such as the Coordination method employ iterative cross-attention mechanisms with a bottleneck to enable the sparse association of inputs. However, these methods are parameter inefficient and fail in more complex relational reasoning tasks. To this end, we propose Associative Transformer (AiT) to enhance the association among sparsely attended input patches, improving parameter efficiency and performance in relational reasoning tasks.
Sparse knowledge association finds resonance with the neuroscientific grounding of the Global Workspace Theory (GWT) (Baars, 1988; Dehaene et al., 1998; VanRullen & Kanai, 2020; Juliani et al., 2022). GWT explains a fundamental cognitive architecture for working memory in the brain where diverse specialized modules compete to write information into a shared workspace through a communication bottleneck. The bottleneck facilitates the processing of content-addressable information using attention guided by contents in the shared workspace (Awh et al., 2006; Gazzaley & Nobre, 2012). A bottleneck guides models to generalize in a manner consistent with the underlying data distribution through inductive biases of sparsity (Baxter, 2000; Goyal & Bengio, 2022), resulting in superior performance in tasks such as relational reasoning.
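The communication bottleneck central to the GWT framing can be illustrated with a toy competition: many module outputs vie for a small number of workspace slots, with the winners weighted by softmax-normalized scores. This is purely a sketch of the bottleneck idea, not the AiT architecture; module names and scores are invented.

```python
# Toy workspace bottleneck: only the top-scoring modules write to the
# limited-slot workspace. Illustrative of the GWT idea, not the AiT model.

import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def write_to_workspace(module_outputs, scores, n_slots=2):
    """Keep only the top-`n_slots` modules (the bottleneck), weighted by
    their normalized competition scores."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = top[:n_slots]
    weights = softmax([scores[i] for i in top])
    return [(module_outputs[i], w) for i, w in zip(top, weights)]

outputs = ["vision", "audio", "motor", "language"]
slots = write_to_workspace(outputs, scores=[2.0, 0.5, 1.5, 0.1], n_slots=2)
print([name for name, _ in slots])  # ['vision', 'motor']
```

The sparsity bias the abstract appeals to is exactly this: most modules are excluded from the workspace at any step, so downstream reads attend over a small, content-selected set.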
Explicit-Blurred Memory Network for Analyzing Patient Electronic Health Records
Chakraborty, Prithwish, Wang, Fei, Hu, Jianying, Sow, Daby
In recent years, we have witnessed an increased interest in temporal modeling of patient records from large scale Electronic Health Records (EHR). While simpler RNN models have been used for such problems, memory networks, which in other domains were found to generalize well, are underutilized. Traditional memory networks involve diffused and non-linear operations where the influence of past events on outputs is not readily quantifiable. We posit that this lack of interpretability makes such networks not applicable for EHR analysis. While networks with explicit memory have been proposed recently, the discontinuities imposed by the discrete operations make such networks harder to train and require more supervision. The problem is further exacerbated in the limited data setting of EHR studies. In this paper, we propose a novel memory architecture that is more interpretable than traditional memory networks while being easier to train than explicit memory banks. Inspired by well-known models of human cognition, we propose partitioning the external memory space into (a) a primary explicit memory block that stores exact replicas of recent events to support interpretations, followed by (b) a secondary blurred memory block that accumulates salient aspects of past events dropped from the explicit block as higher-level abstractions, allowing training with less supervision by stabilizing the gradients. We apply the model to 3 learning problems on ICU records from the MIMIC III database spanning millions of data points. Our model performs comparably to the state-of-the-art while also, crucially, enabling ready interpretation of the results.
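The two-block partition described above can be sketched in a few lines. This is an illustrative reduction, not the paper's network: recent events sit verbatim in a fixed-size explicit block, and events evicted from it are folded into a scalar "blurred" summary via an exponential moving average standing in for the learned abstraction.

```python
# Sketch of the explicit + blurred memory idea (illustrative only): exact
# recent events in a bounded block; evicted events decay into a summary.

from collections import deque

class TwoBlockMemory:
    def __init__(self, explicit_size=3, decay=0.5):
        self.explicit = deque(maxlen=explicit_size)  # exact recent events
        self.blurred = 0.0                           # abstracted older events
        self.decay = decay

    def add(self, event_value):
        if len(self.explicit) == self.explicit.maxlen:
            # Oldest event is about to leave the explicit block; blur it in.
            evicted = self.explicit[0]
            self.blurred = (self.decay * self.blurred
                            + (1 - self.decay) * evicted)
        self.explicit.append(event_value)

mem = TwoBlockMemory(explicit_size=2, decay=0.5)
for v in [1.0, 2.0, 3.0]:
    mem.add(v)

print(list(mem.explicit))  # [2.0, 3.0]
print(mem.blurred)         # 0.5
```

The interpretability claim maps onto the explicit block (recent events remain inspectable verbatim), while the blurred block keeps gradients flowing through a smooth summary instead of discrete memory operations.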
Causal interpretation rules for encoding and decoding models in neuroimaging
Weichwald, Sebastian, Meyer, Timm, Özdenizci, Ozan, Schölkopf, Bernhard, Ball, Tonio, Grosse-Wentrup, Moritz
Causal terminology is often introduced in the interpretation of encoding and decoding models trained on neuroimaging data. In this article, we investigate which causal statements are warranted and which ones are not supported by empirical evidence. We argue that the distinction between encoding and decoding models is not sufficient for this purpose: relevant features in encoding and decoding models carry a different meaning in stimulus- and in response-based experimental paradigms. We show that only encoding models in the stimulus-based setting support unambiguous causal interpretations. By combining encoding and decoding models trained on the same data, however, we obtain insights into causal relations beyond those that are implied by each individual model type. We illustrate the empirical relevance of our theoretical findings on EEG data recorded during a visuo-motor learning task.